Random deep neural networks are biased towards simple functions
We prove that the binary classifiers of bit strings generated by random wide
deep neural networks with ReLU activation function are biased towards simple
functions. The simplicity is captured by the following two properties. For any
given input bit string, the average Hamming distance of the closest input bit
string with a different classification is at least √(n / (2π log n)),
where n is the length of the string. Moreover, if the bits of the initial
string are flipped randomly, the average number of flips required to change the
classification grows linearly with n. These results are confirmed by numerical
experiments on deep neural networks with two hidden layers, and settle the
conjecture stating that random deep neural networks are biased towards simple
functions. This conjecture was proposed and numerically explored in [Valle
Pérez et al., ICLR 2019] to explain the unreasonably good generalization
properties of deep learning algorithms. The probability distribution of the
functions generated by random deep neural networks is a good choice for the
prior probability distribution in the PAC-Bayesian generalization bounds. Our
results constitute a fundamental step forward in the characterization of this
distribution, therefore contributing to the understanding of the generalization
properties of deep learning algorithms.
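As an informal illustration of the second property (average flips to change the classification growing linearly with n), one can draw a random wide two-hidden-layer ReLU network and count how many random bit flips of an input change its label. This is a hedged sketch under assumed network widths and initializations; it is not the paper's experimental setup, and all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_classifier(n, width=512):
    """Random two-hidden-layer ReLU network mapping length-n bit strings
    to a binary label (weights drawn i.i.d. Gaussian, standard scaling)."""
    W1 = rng.normal(0, 1 / np.sqrt(n), (width, n))
    W2 = rng.normal(0, 1 / np.sqrt(width), (width, width))
    w3 = rng.normal(0, 1 / np.sqrt(width), width)
    def f(x):
        h = np.maximum(W1 @ x, 0)
        h = np.maximum(W2 @ h, 0)
        return int(w3 @ h > 0)
    return f

def flips_to_change(f, x):
    """Flip randomly chosen bits of x (each at most once) until the
    classification changes; return the number of flips used."""
    y0 = f(x)
    z = x.copy()
    for k, i in enumerate(rng.permutation(len(x)), start=1):
        z[i] = 1 - z[i]
        if f(z) != y0:
            return k
    return len(x)

n = 64
f = random_relu_classifier(n)
samples = [flips_to_change(f, rng.integers(0, 2, n).astype(float))
           for _ in range(50)]
print(np.mean(samples))  # the paper predicts this average grows linearly in n
```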
Quantum Earth Mover's Distance: A New Approach to Learning Quantum Data
Quantifying how far the output of a learning algorithm is from its target is
an essential task in machine learning. However, in quantum settings, the loss
landscapes of commonly used distance metrics often produce undesirable outcomes
such as poor local minima and exponentially decaying gradients. As a new
approach, we consider here the quantum earth mover's (EM) or Wasserstein-1
distance, recently proposed in [De Palma et al., arXiv:2009.04469] as a quantum
analog to the classical EM distance. We show that the quantum EM distance
possesses unique properties, not found in other commonly used quantum distance
metrics, that make quantum learning more stable and efficient. We propose a
quantum Wasserstein generative adversarial network (qWGAN) which takes
advantage of the quantum EM distance and provides an efficient means of
performing learning on quantum data. Our qWGAN requires resources polynomial in
the number of qubits, and our numerical experiments demonstrate that it is
capable of learning a diverse set of quantum data.
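For intuition, the classical EM (Wasserstein-1) distance that the quantum version generalizes can be computed in one dimension directly from cumulative distribution functions. The sketch below is a classical illustration only, not the quantum EM distance of the paper; the function name is mine.

```python
import numpy as np

def w1_distance(p, q):
    """Classical Wasserstein-1 (earth mover's) distance between two
    distributions on a unit-spaced 1D grid, via the standard identity
    W1(p, q) = sum_k |CDF_p(k) - CDF_q(k)|."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

# All mass must be transported 6 grid points, so W1 = 6.
p = np.array([0.5, 0.5, 0, 0, 0, 0, 0, 0])
q = np.array([0, 0, 0, 0, 0, 0, 0.5, 0.5])
print(w1_distance(p, q))  # 6.0
```

Unlike fidelity-style distances, this metric changes smoothly as mass moves across the support, which is the stability property the quantum analog inherits.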
Quantum artificial intelligence: learning unitary transformations
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, May 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 77-83).

Linear algebra is a simple yet elegant mathematical framework that serves as the mathematical bedrock for many scientific and engineering disciplines. Broadly defined as the study of linear equations represented as vectors and matrices, linear algebra provides a mathematical toolbox for manipulating and controlling many physical systems. For example, linear algebra is central to the modeling of quantum mechanical phenomena and machine learning algorithms. Within the broad landscape of matrices studied in linear algebra, unitary matrices stand apart for their special properties, namely that they preserve norms and have easy-to-calculate inverses. Interpreted from an algorithmic or control setting, unitary matrices are used to describe and manipulate many physical systems.

Relevant to the current work, unitary matrices are commonly studied in quantum mechanics, where they formulate the time evolution of quantum states, and in artificial intelligence, where they provide a means to construct stable learning algorithms by preserving norms. One natural question that arises when studying unitary matrices is how difficult it is to learn them. Such a question may arise, for example, when one would like to learn the dynamics of a quantum system or apply unitary transformations to data embedded in a machine learning algorithm. In this thesis, I examine the hardness of learning unitary matrices both in the context of deep learning and in that of quantum computation. This work aims both to advance our general mathematical understanding of unitary matrices and to provide a framework for integrating unitary matrices into classical or quantum algorithms.
Different forms of parameterizing unitary matrices, in both the quantum and classical regimes, are compared in this work. In general, experiments show that learning an arbitrary d×d unitary matrix requires at least d² parameters in the learning algorithm, regardless of the parameterization considered. In classical (non-quantum) settings, unitary matrices can be constructed by composing products of operators that act on smaller subspaces of the unitary manifold. In the quantum setting, there also exists the possibility of parameterizing unitary matrices in the Hamiltonian setting, where it is shown that repeatedly applying two alternating Hamiltonians is sufficient to learn an arbitrary unitary matrix. Finally, I discuss applications of this work in quantum and deep learning settings. For near-term quantum computers, applying a desired set of gates may not be efficiently possible; instead, desired unitary matrices can be learned from a given set of available gates (similar to ideas discussed in quantum control). Understanding the learnability of unitary matrices can also aid efforts to integrate unitary matrices into neural networks and quantum deep learning algorithms. For example, deep learning algorithms implemented on quantum computers may leverage the parameterizations discussed here to form layers in a quantum learning architecture.

by Bobak Toussi Kiani. S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering.
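The d² parameter count above matches the dimension of the unitary group: writing U = exp(iH) with a Hermitian generator H uses exactly d real diagonal entries plus d(d-1) real numbers for the complex upper triangle, i.e. d² parameters in total. The sketch below (my illustration, not the thesis code) builds such a U and checks unitarity.

```python
import numpy as np
from scipy.linalg import expm

def unitary_from_params(theta, d):
    """Build U = exp(iH) from exactly d*d real parameters:
    d for the real diagonal of H, and d*(d-1) for the real and
    imaginary parts of its strict upper triangle."""
    assert theta.size == d * d
    H = np.zeros((d, d), dtype=complex)
    H[np.diag_indices(d)] = theta[:d]                 # real diagonal
    off = theta[d:].reshape(-1, 2)
    H[np.triu_indices(d, k=1)] = off[:, 0] + 1j * off[:, 1]
    H += np.conj(H.T) - np.diag(H.diagonal())         # Hermitize (diagonal once)
    return expm(1j * H)                               # Hermitian H => unitary U

d = 4
theta = np.random.default_rng(1).normal(size=d * d)
U = unitary_from_params(theta, d)
print(np.allclose(U @ U.conj().T, np.eye(d)))  # True: U is unitary
```

Gradient-based learning of a target unitary can then optimize over this theta vector, consistent with the d²-parameter lower bound observed in the thesis.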